Abstract

The evolution of cooperation has been a perennial problem in evolutionary biology because co- 
operation can be undermined by selfish cheaters (or "free riders") that profit from cooperators but 
do not invest any resources themselves. Evolutionary game theory has been able to show that under 
certain conditions, cooperation nonetheless evolves stably, for example if players have the opportunity 
to punish free riders that benefit from a public good, but refuse to pay into a common pool. In the 
public goods game, cooperation emerges naturally if the synergy of the public good is sufficiently 
high. However, the need for such high synergy effectively constitutes a barrier to cooperation because it
is rarely achieved in practice. Here we show that punishment reduces this barrier, and enables a
transition from defecting towards cooperative behavior at synergy levels that could not support co- 
operation in the absence of punishment. We use an agent-based evolutionary simulation in which 
the agents' decisions to cooperate and to punish are encoded by genes that evolve via Darwinian 
evolution. We observe that punishment is beneficial for the evolutionary transition from defection 
to cooperation, but that once cooperation is established the punishment gene becomes unnecessary 
and drifts neutrally. Thus, punishment is absent in populations that defect and random in popula- 
tions that cooperate, but is crucial to catalyze the transition between those regimes, and leads to 
history-dependent effects. We conclude that punishment can be maintained as a low-cost guarantor 
of cooperation as long as intermittent eruptions of defection maintain the functionality of the punish- 
ing pathways, an equilibrium reminiscent of the establishment of global peace via a policy of nuclear 
deterrent. 



"Tragedy of the commons" is the name given to a social dilemma [1] that occurs when a number of
individuals maximize their self-interest by exploiting a public good, and by doing so harm their (and 
others') own long-term interest. This is but one dilemma [2] that can be described within the framework
of Evolutionary Game Theory (EGT) [3-7]. While the tragedy of the commons is important in social
science and politics (overfishing and the destruction of the environment in general come to mind), it also
plays an important role in biology: both the evolution of virulence [8] and the manipulation of a host by
a group of parasites [9] can be viewed as a dilemma of the public goods type.



The public goods game is a standard of experimental economics [10-12], where players possess a
number of tokens that they can contribute to a common pool (the "investment" into the public good).
The total contributed by the players is multiplied by a "synergy factor", and this amount is then equally
distributed to the players in the pool, irrespective of whether they have contributed or not. A group of
players fares best if all the players contribute so as to take maximum advantage of the synergy, but this
behavior is vulnerable to "free-riders" that share in the pool but do not contribute themselves. Indeed, as
can easily be shown, the rational Nash equilibrium of the game is not to pay in, because this strategy
dominates all others regardless of their play.



It has been shown that punishment is an effective way to counteract defectors [13-23]. Because
punishment involves an additional cost to the cooperators that already invest into the public good [24-26],
these cooperators (termed "moralists" by Helbing et al. [23]) are themselves vulnerable to the invasion
of non-punishing cooperators called "secondary free-riders". As a consequence, we might expect that
moralists ultimately become extinct, either because they are outcompeted by defectors, or by cooperating
free-riders who benefit from the punishment without bearing the associated cost. On the other hand, if
moralists were ultimately successful in eliminating defectors, then there would effectively be no difference
between cooperators that punish and those that do not, as no punishment ever takes place. Thus, in no
case would punishment ever become the dominant strategy; it would only play a role at the boundary
between defection and cooperation.

It was recently shown that, instead, in the spatial version of the public goods game, moralists can
win direct competitions [23] if the environmental conditions are favorable, namely if the cost and effect
of punishment favor moralists over defectors. Spatial games, where the offspring of successful strategies
are placed near the parent (and where, as a consequence, strategies are more likely to play against kin
strategies), give rise to spatial reciprocity [19], which appears to be the advantage that moralists need to
gain superiority.



According to References [23,27,28], punishing cooperators do not, in contrast, fare so well in
well-mixed populations. There, punishing cooperators appear to lose the fight against the cooperators that
do not punish, and that catch a "free ride", as it were, on the costly punishment meted out by their
moralist peers. As a consequence, defectors can spread. This is a surprising conclusion if it is meant to be
unqualified, because at the very least it should be clear that, as long as the synergy gain is large enough,
cooperators will be favored no matter whether the dynamics are spatial or well-mixed. Indeed, for a
game with 5 players in the group for example (the case studied here and elsewhere), a five-fold synergy
implies that the total payoff for a defector is the same as for a cooperator independent of the group he
is in, and if the synergy is higher than that, not paying in is in fact detrimental to the individual.
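The break-even value can be checked with one line of arithmetic. The sketch below assumes, as in the game description given in Methods, that each unit paid in is multiplied by r and shared equally among the k + 1 = 5 group members, and it ignores punishment. The labels "pay in" and "free ride" are introduced here only for illustration.

```latex
\begin{align*}
P_{\text{pay in}} - P_{\text{free ride}} &= \frac{r}{k+1} - 1, \\
k = 4,\ r = 5: \quad \frac{5}{5} - 1 &= 0, \\
k = 4,\ r = 6: \quad \frac{6}{5} - 1 &= 0.2 > 0 .
\end{align*}
```

The contributor recovers r/(k+1) of its own unit regardless of what the co-players do, which is why the comparison does not depend on the composition of the group.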

We believe that both conundrums (the survival of the moralists in the spatial game and the
ineffectiveness of punishment in the well-mixed game) can be resolved if punishment is not a binary
choice (you are either a punisher or not), but is instead a stochastic decision where the probability to
punish is shaped by the evolutionary process. Here, we show that if punishment is stochastic, spatial
reciprocity is in fact not a necessary condition for the evolution of cooperation via punishment and
the dominance of moralists. If stochastic strategies can evolve via Darwinian dynamics in a framework
where decisions are encoded within genes that adapt to their environment, we can find conditions where
cooperation evolves even without punishment, but absent those, punishment can promote the evolution
of cooperation (as long as punishment is effective and cheap) in well-mixed populations.

In previous work, we have investigated the evolution of stochastic strategies in the iterated Prisoner's
Dilemma where players' decisions are conditional on their previous behavior [29], and found that
cooperation is favored as long as the communication channel between players is reliable enough. In a sense,
the public goods game is a multi-player Prisoner's Dilemma so we should expect similar dynamics, except
that players in the public goods game do not remember previous plays. Thus, cooperation has to be
ensured by different means, for example by punishment. Still, many of the characteristics that we found in
the stochastic implementation with a genetic basis we will encounter here too: strategies defined by genes
encoding decision probabilities evolve towards a fixed point that is optimal given the selective pressures
and environmental conditions. However, the selective pressures are determined by the population: if
defectors are absent, for example, genes encoding probabilities that are only "expressed" if defectors are
present drift neutrally. Thus, we do not expect that punishing cooperators are maintained after defectors
have been driven to extinction in this scenario. When punishment is meaningless, it becomes random.
However, we will see that punishment is critical in the transition from defection to cooperation, playing
the role of a catalyst.

Results 

Evolutionary trajectories and fixed points 

We evolve stochastic strategies playing the public goods game with punishment in a well-mixed
population, as described in Methods. Agents possess two genes: one (from now on called the "C gene") defining
the probability to cooperate p_C, and one (from now on called the "P gene") that determines the
probability p_P to punish. As the strategies adapt to the environmental conditions (specified by the
parameters that define the game, as well as the spatial properties, the mutation rate, and the replacement rate),
the probabilities change from their initial values (p_C, p_P) = (0.5, 0.5) towards the selected "fixed point"
strategy. In order to visualize the evolutionary trajectory of a population, we reconstruct the evolutionary
line of descent of an experiment (LOD, see Methods), which tells the story of that adaptation, mutation
by mutation. While the LOD in each particular run can show probabilities varying wildly, averaging
many such LODs can tell us about the selective pressures the populations face. In particular, averaging
the probabilities on the LODs after they have settled down can tell us the fixed point of evolutionary
adaptation [29]. We determine this fixed point by discarding the first 250,000 updates of every run (the
transient), along with the last 50,000 (in order to remove the dependence of the LOD on the randomly
chosen anchor genotype) and averaging the remaining 200,000 updates. Note that this fixed point is a
computational fixed point only: we do not mean to imply that the population's genotypes all end up on
this exact point. Rather, due to the nature of the game and the selective pressures that change as the
composition of the population changes, the evolutionary trajectories approach this point and then
fluctuate around or near it. Thus, the fixed point reflects the mean successful strategy given the conditions
of the game.
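The averaging procedure described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes that a LOD is available as a list of (p_C, p_P) pairs with one entry per update, and the names `fixed_point` and `averaged_fixed_point` are hypothetical.

```python
from statistics import mean

def fixed_point(lod, transient=250_000, tail=50_000):
    """Average (p_C, p_P) along one line of descent, dropping both ends.

    `lod` is an assumed data layout: a list of (p_C, p_P) pairs, one per update.
    The first `transient` and last `tail` entries are discarded.
    """
    window = lod[transient:len(lod) - tail]
    if not window:
        raise ValueError("LOD too short for the requested cut-offs")
    p_c = mean(p for p, _ in window)
    p_p = mean(q for _, q in window)
    return p_c, p_p

def averaged_fixed_point(lods, **cuts):
    """Average the per-run fixed points over several independent runs."""
    points = [fixed_point(lod, **cuts) for lod in lods]
    return mean(p for p, _ in points), mean(q for _, q in points)
```

With the default cut-offs and a run of 500,000 updates, the retained window covers the 200,000 updates mentioned in the text.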

Figure 1. Evolutionary trajectories for different synergies. Evolution of strategies (p_C, p_P) on
the LOD for synergy factors r = 3 (black), r = 4 (green), and r = 5 (red). All trajectories originate at
(0.5, 0.5). We show an average of the LOD of 10 runs each. Here, β = 0.8, γ = 0.2, and μ = 0.02.

We show in Fig. 1 the average trajectories for three different synergy factors r = 3, 4, and 5, all
anchored at the random strategy (p_C, p_P) = (0.5, 0.5) that was used as the seed strategy for every
evolutionary run. We can see that, depending on the synergy (and the values chosen for the cost and
effect of punishment), populations evolve towards a cooperating or defecting fixed point, and take different
trajectories to get there. For r = 3, synergy is too low to lead to cooperation, and the fixed point of
that trajectory is (p_C, p_P) = (0, 0), that is, defection. For r = 4, however, the population moves toward
a fixed point centered around (p_C, p_P) = (0.7, 0.2), that is, players cooperate most of the time. (The
location of the endpoint of the trajectory does not depend on the starting point.) Note, however, that
the players engage in punishment only sparingly. For r = 5, cooperation is almost fully established, while
punishment occurs about 40% of the time on average. However, the average trajectory (average over ten
independent runs) is misleading here, because at this level of cooperation, the punishment gene has begun
to drift. This is due to a substantially weakened selection on the punishment gene if players engage in
defection only 5% of the time. An unselected probability p_P is a uniformly distributed random number,
with mean 1/2 and variance 1/12. As p_C → 1, the average p_P and its variance approach precisely these
numbers.
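For reference, the quoted mean and variance follow from the standard moments of a random number p drawn uniformly from the interval [0, 1]:

```latex
\begin{align*}
\mathrm{E}[p] &= \int_0^1 p\,dp = \tfrac{1}{2}, \\
\mathrm{E}[p^2] &= \int_0^1 p^2\,dp = \tfrac{1}{3}, \\
\mathrm{Var}[p] &= \mathrm{E}[p^2] - \mathrm{E}[p]^2 = \tfrac{1}{3} - \tfrac{1}{4} = \tfrac{1}{12}.
\end{align*}
```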

When mapping the possible parameters β (effectiveness) and γ (cost) of punishment (defined in
Methods) each in the range from 0.0 to 1.0 and at low synergy r = 3.0, we find that defection is the most
prevalent strategy on the LOD (see Figure 2A), as was found previously [22,23]. When γ = 0 there is
no cost associated with the punishment, which implies that the P gene is not under selection and drifts.
Thus, for this value of synergy (and lower), we find that the strategy fixed point is defection without
punishment, except for γ = 0, where punishment is random.

As the degree of synergy increases to r = 3.5, cooperation starts to appear even in this well-mixed
population (see Fig. 2B), while it appears as early as r = 2 for sufficiently high β and low γ in the
spatial (but deterministic) version of the game, see [22,23]. For r = 4 we find players cooperating
(p_C ≈ 0.8) at high β and low γ, which indicates that under conditions where punishment is not very
costly or even free, punishment pays off. In addition we notice that the probability to punish increases
under the same conditions that allow cooperation (high β and low γ, that is, high impact and low cost
of punishment), indicating that punishment is indeed used to enforce cooperation (Fig. 2C). The mean
punishment probability grows to 0.5, but at the same time the variance shows that this gene is not under
selection (as long as γ ≠ 0).

Increasing the synergy level even higher towards r = 4.5 shows the emergence of dominance of
cooperation (p_C > 0.5) for most of the range of punishment cost and effectiveness, see Figure 2D. At
the same time the punishment probability reaches 0.5 for a larger range of parameters, but the mean
punishment probability on the LOD never exceeds 0.5, implying that full persistent punishment is not
stable, and probably not necessary. Note that, in an implementation where decisions are deterministic
(such as in the implementation of Helbing et al. [23]), punishment may remain for a long time in the
population even though it is not selected anymore. In that case, players that cooperate with and without
punishment have exactly the same fitness, and one or the other strategy should only dominate by drifting
to fixation neutrally, a process that can take a significant amount of time in large populations such as
those studied in Ref. [23].



Critical dynamics and the role of punishment 

Previously, a phase transition between cooperative and defective behavior in the public goods game as
a function of the synergy r was observed for the spatial version [22,28,30] of the game (but not the
well-mixed version). We can study the critical point and its dependence on punishment in detail in the
well-mixed version of the game, where analytical predictions are available. We show in Fig. 3 the average
probability to cooperate (solid line) and to punish (dashed line) as a function of synergy for our default
values γ = 0.2 and β = 0.8. Cooperation sets in at r = 4 and becomes prevalent for synergies just
exceeding that.

We will now study how punishment affects the critical point. The average probability of cooperation
in Fig. 3 shows the typical behavior of an order parameter as a function of the critical parameter r. It
is instructive to run a control of the experiment where punishment does not exist. If we force p_P = 0,
cooperation does not set in until r = 4.5 (see inset in Fig. 3) and only becomes dominant at r = 5.
Thus, although punishment is sporadic when it is possible, and drifts when cooperation is established, it
is essential to lower the critical barrier for cooperation. The probability distribution of the punishment
gene throughout the population (Fig. 4) shows that punishment is never prevalent: it is absent below
the critical point, and close to uniform above it. In a sense, punishment catalyzes the transition from
defection to cooperation. Note also that the levels of cooperation achieved are significantly higher when
punishment exists, even though punishment is only weakly selected for. Apparently, the possibility of
punishment alone is sufficient to enforce higher levels of cooperation.

Figure 2. Mean probabilities for cooperation p_C and punishment p_P at the evolutionary
fixed point. These graphs show the fixed point as a function of the cost of punishment γ and the
effectiveness of punishment β, for different values of the synergy r. Left panel: probability to cooperate
p_C, right panel: probability to punish p_P. Note the inversion of the β and γ scales for better visibility.
Mutation rate is set to μ = 0.02. A: For r = 3, cooperation does not evolve except when punishment is
free (γ = 0), and even then only if punishment is very effective (β close to 1). At γ = 0, the punishment
gene is neutral. B: For r = 3.5 defection is still the predominant strategy except for very low γ and high
β. C: At r = 4, cooperation is fully established for low γ and high β, but not for medium values. D:
For r = 4.5 cooperation is the dominant strategy for all values of the cost γ, and for high effect
(β > 0.75). Note that the average punishment probability p_P never exceeds 0.5 (the value achieved
when the gene drifts neutrally).

Figure 3. Mean probability of cooperation p_C (solid, left scale) and probability of punishment p_P
(dashed, right scale) with adaptive punishment at the evolutionary fixed point of the trajectory, as a
function of the synergy r (β = 0.8, γ = 0.2, μ = 0.02; 100 replicates for each data point). The
probability to cooperate when punishment is forced to zero (p_P = 0) is shown in the inset.




We can calculate approximately the point at which cooperation is favored in a mean-field approach
that does not take mutation and evolution into account, by writing Eqs. (5) and (6) in terms of the density of
cooperators ρ_C encountered by players in a group. Both naked cooperators and punishing cooperators
(moralists) contribute to this density, i.e., ρ_C = (N_C + N_M)/N, where N is the total number of players
in the group. We can also introduce the mean density of punishers ρ_P = (N_M + N_I)/N encountered by
a player. Because the mean density of cooperators and punishers is the same for both cooperators and
defectors in a well-mixed scenario (but not for spatial play!), we can then write

$$P_C = r\,\frac{k\rho_C + 1}{k+1} - 1, \tag{1}$$

and

$$P_D = r\,\frac{k\rho_C}{k+1} - \beta\rho_P, \tag{2}$$

and we expect cooperation to be favored if

$$P_C - P_D = \frac{r}{k+1} - 1 + \beta\rho_P > 0 \tag{3}$$

or

$$r > (k+1)(1 - \beta\rho_P). \tag{4}$$

This equation implies that the emergence of cooperation depends crucially on the density of punishers.

Figure 4. Histogram of the punishment probability distribution, in a typical equilibrated population,
just before the critical point (r = 4, red), at the critical point (r = 4.15, green), and above r_crit (r = 4.5,
blue).

In fact, the mean-field theory predicts that cooperation in the absence of punishment is favored only at
r = 5. We see cooperation emerge quite a bit earlier than that in our simulations (see inset in Fig. 3),
but p_C crosses 0.5 very close to r = 5, as predicted by the mean-field theory.
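The mean-field condition is simple enough to evaluate directly. The following sketch encodes the payoff gap of Eq. (3) and the threshold of Eq. (4); the function names are hypothetical, and the defaults k = 4 and β = 0.8 are the default values quoted in the text.

```python
def payoff_gap(r, rho_p, k=4, beta=0.8):
    """Mean-field payoff advantage of cooperating over defecting, Eq. (3).

    rho_p is the density of punishers encountered by a player.
    """
    return r / (k + 1) - 1 + beta * rho_p

def critical_synergy(rho_p, k=4, beta=0.8):
    """Synergy above which cooperation is favored in mean field, Eq. (4)."""
    return (k + 1) * (1 - beta * rho_p)

if __name__ == "__main__":
    # Illustrative punisher densities only; these are not simulation results.
    for rho_p in (0.0, 0.1, 0.25, 0.5):
        print(rho_p, critical_synergy(rho_p))
```

Without punishers the threshold is r = k + 1 = 5, and every increase in the punisher density lowers it linearly, which is the sense in which punishment reduces the barrier to cooperation.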

We can test Eq. (4) by finding the critical r at which p_C crosses 0.5 for simulations in which the
punishment probability is held fixed, so that ρ_P ≈ p_P. To find the critical point, we performed 100
simulations each at fixed r with a resolution of Δr = 0.5 and interpolated data within the steep portion
of the transition to find the crossover point. The critical line r_c = (k + 1)(1 - β ρ_P) is indicated in Fig. 5
for k = 4 and β = 0.5 (r_c = 5 - 4ρ_P). The mean-field theory reproduces the experimental r_c within
errors.
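The crossover estimate can be sketched as a linear interpolation between the two sampled synergy values that bracket p_C = 0.5. This is one plausible reading of the procedure, not a description of the authors' code; the function name `crossing_point` is hypothetical and the numbers in the usage example are invented purely for illustration.

```python
def crossing_point(r_values, pc_values, level=0.5):
    """Linearly interpolate the synergy at which mean p_C first rises through `level`.

    r_values must be increasing; pc_values are the matching mean cooperation
    probabilities (assumed inputs, e.g. averages over replicate runs).
    Returns None if no upward crossing lies inside the sampled range.
    """
    points = list(zip(r_values, pc_values))
    for (r0, p0), (r1, p1) in zip(points, points[1:]):
        if p0 < level <= p1:
            return r0 + (level - p0) * (r1 - r0) / (p1 - p0)
    return None

if __name__ == "__main__":
    # Invented sample data on a grid with spacing 0.5 (not taken from the paper).
    r_grid = [3.0, 3.5, 4.0, 4.5, 5.0]
    pc_mean = [0.0, 0.05, 0.2, 0.8, 0.95]
    print(crossing_point(r_grid, pc_mean))
```

Because the grid spacing bounds how well the crossing can be located, an uncertainty of the order of the resolution is a natural error bar for such an estimate.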




Figure 5. Prediction of critical point at fixed punishment [Eq. (4), solid line] and extrapolated critical
point at transition, for simulations in which the probability to punish was kept fixed and constant.
We used k = 4, β = 0.5, and γ = 0.2. The error bars reflect the finite resolution Δr = 0.5.



Because of the critical importance of punishers in determining the synergy level at which cooperation
emerges, the public goods game with a genetic basis implies curious dynamics close to the critical point.
Below the critical point, defection is a stable strategy, and punishment is absent. Only when cooperation
emerges as a possibility does punishment become more and more important, leading to a lowering of the
critical synergy for cooperation via Eq. (4). Thus, cooperation emerges rapidly and decisively once
a critical level has been achieved. Once cooperation is dominant and defectors are all but driven to
extinction, punishment becomes irrelevant and the gene begins to drift. As this happens, the fraction of
punishers drops, raising the critical synergy. Thus, a drifting punishment gene can lead to the sudden
re-emergence of defectors as stable states. Once those have taken over, the reverse dynamics begins to
unfold. In other words, we should observe periods of cooperation and defection that follow each other
closely when the synergy is near the critical point.

These dynamics are reminiscent of the phenomenon of supercooling and superheating in phase
transitions. If we imagine the synergy parameter r as the critical parameter and the mean probability to
cooperate as the order parameter, it is possible that when r is slowly increased, the population remains in
the defecting phase because a switch to cooperation requires a critical number of cooperators as a "seed".
In such a situation, the defecting phase is unstable to fluctuations. If a critical number of cooperators
emerges by chance, punishment immediately becomes effective against defectors, lowers the critical point
as implied by Eq. (4), and the population could transition to cooperation very quickly. A hallmark of such
bi-stable systems that require nucleation events in order to transition is hysteresis, a phenomenon where
the state of the system depends on its history. We can test whether hysteresis exists in the public goods
game (and whether the strength of this effect depends on the probability to punish), by adiabatically
changing the synergy parameter first from low to high (transitioning from defection to cooperation), and
then adiabatically back from high to low. While we see evidence of hysteresis even when punishment
is absent (Fig. 6A), the effect is much more pronounced when punishment is possible (Fig. 6B). The
population moves from cooperation to defection at about the expected critical synergy r_crit ≈ 4.15 as r
is decreased, but stays in the defecting phase much beyond the critical point as r is increased.

Figure 6. Population fraction of cooperators (measured as the density of non-punishing cooperators
plus the density of moralists) as a function of synergy r when r is adiabatically changed from low to
high values (red), and back from high values to low values (blue). All population fractions are started
at 0.5 (either at the high or low end of r). The lines show the average over 100 runs. Standard error is
of the size of the fluctuations.



Discussion 

We studied Darwinian evolution of stochastic strategies in the public goods game for well-mixed
populations, using genes that encode the probabilities for cooperation and punishment. It is known that
punishment can drive the evolution of cooperation above a critical synergy level as long as there is a
spatial structure in the environment [22,23]. It was also previously believed that in well-mixed
populations cooperation can only become successful if additional factors like reputation [19] or the potential for
abstaining from the public good [30,31] are influencing the evolution. Here we show that cooperation
readily emerges in a well-mixed environment above a critical level of synergy. This critical level is
influenced by a number of factors: the rate of punishment, because punishment favors cooperating groups,
but also the mutation rate, because mutations can create cooperating groups by chance, and encourage a
minimal punishment rate in order to keep mutated defectors at bay. Finally, spatial structure also affects
the critical point, as is well known [22,28,30], because a single cooperator can nucleate a transition simply
because offspring cooperators are placed next to it, giving rise to a "bubble" of cooperators of sufficient
size.

We conclude that in well-mixed populations cooperation can emerge if the synergy outweighs the 
defectors' reward. If the mutation rate is low enough, the dearth of defectors in the cooperating phase 
makes punishment obsolete, that is, the selective pressure to punish disappears. Naturally, once this has 
occurred defectors can again gain a foothold, and the balance of power between cooperators and defectors 
could shift. Such a shift, however, reinstates the selective pressure to punish, leading to a re-emergence 
of moralists that can drive defectors out once more. Thus, for synergy factors near the critical point, we 
can expect oscillations between cooperators and defectors, and no strategy is ever stable. 

We have not studied here the possibility of "anti-social" punishment [32], where non-cooperating
defectors can punish cooperators, but we do not expect this possibility to change the overall picture.
Indeed, in simulations in which defection was not punished but instead rewarded (a negative punishment),
this only served to reinforce the defecting phase. A transition to the cooperative phase still takes place
at sufficiently high synergy. Phase transitions between cooperative and defecting phases have also been
observed in a spatial version of the public goods game where costly rewards are given for cooperation,
rather than costly punishment for defectors [33]. It would be interesting to study this game within
the context of evolving stochastic strategies.

It is difficult to evade the analogy between punishment as a catalyzing agent of cooperation (while
punishment is in fact rarely used) and the politics of a nuclear deterrent and mutually assured
destruction, where the threat of severe punishment alone is sufficient to ensure long periods of peace between
superpowers. Previously, the game of "chicken" from the EGT literature was used to describe the politics
of deterrence [34], but in that game defection affected only the players, not an entire community, and the
punishment for uncooperative behavior was the action of defection itself. In the public goods game with
punishment the punitive action is a reaction to defection, and its threat alone appears to be sufficient to
realize peaceful coexistence for prolonged periods of time.



Methods 

The public goods game emulates strategic decision making by groups, in which individuals must select
between different decisions that affect the group as a whole. Each individual in a group of k + 1 players
(k = 4 in the present implementation) can decide to cooperate by making a contribution of 1 unit to the
public good, while defecting individuals do not contribute. We encode this choice into a genetic locus as
a probability p_C, which can be thought of as the outcome of a network of genes that encode this decision.
When mutating strategies, instead of mutating the individual genes that make up the decision pathway,
we simply replace the parental probability p_C by a uniformly drawn random number in the offspring.

The sum of all contributions from cooperating players is multiplied by r (the synergy factor) and
divided among all players. In addition, each player has the option to punish players who do not contribute.
This decision is encoded into a different genetic locus with an independent probability p_P. Following
Helbing et al. [23], those players that defect suffer a fine β/k levied by each punisher in the group, which
costs each punisher a penalty of γ/k. At each update, every player engages in a game with all its assigned
opponents. The number of cooperators N_C, defectors N_D, moralists N_M and immoralists (players who
defect but also punish [23]) N_I is computed, and the payoff is assigned as follows: A cooperator receives

$$P_C = r\,\frac{N_C + N_M + 1}{k+1} - 1, \tag{5}$$

while a defector takes away

$$P_D = r\,\frac{N_C + N_M}{k+1} - \beta\,\frac{N_M + N_I}{k}. \tag{6}$$

Moralists receive

$$P_M = P_C - \gamma\,\frac{N_D + N_I}{k}, \tag{7}$$

while immoralists earn

$$P_I = P_D - \gamma\,\frac{N_D + N_I}{k}. \tag{8}$$

In these expressions the counts refer to the focal player's k co-players, so that a cooperator's own
contribution appears as the additional 1 in Eq. (5).
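The four payoffs can be written as one small function. This sketch assumes the Helbing-style form of the payoffs described above, with the counts taken among the focal player's k co-players (an assumed convention); the function name `payoffs` and its argument names are hypothetical.

```python
def payoffs(n_c, n_d, n_m, n_i, r, beta, gamma, k=4):
    """Return (P_C, P_D, P_M, P_I) for a focal player facing k co-players.

    n_c, n_d, n_m, n_i count cooperators, defectors, moralists and
    immoralists among the co-players (assumed convention), so they sum to k.
    """
    if n_c + n_d + n_m + n_i != k:
        raise ValueError("co-player counts must sum to k")
    fine = beta * (n_m + n_i) / k    # total fine received by a defecting focal player
    cost = gamma * (n_d + n_i) / k   # total cost paid by a punishing focal player
    p_c = r * (n_c + n_m + 1) / (k + 1) - 1   # focal player's own unit is the "+ 1"
    p_d = r * (n_c + n_m) / (k + 1) - fine
    return p_c, p_d, p_c - cost, p_d - cost
```

As a consistency check, with four cooperating co-players, no punishers and r = 5, the cooperator and defector payoffs coincide, which is the break-even synergy for a group of five.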



The population consists of 1,024 individuals who each have four assigned opponents. Since all
opponents are also players, each individual plays five games per update. The choices of each individual
are determined by their probabilities to cooperate p_C and to punish p_P. After each round, 2 percent
of the population is replaced using a Moran process [35] in a well-mixed fashion, that is, the identity of
the players in the group is unrelated to their ancestry so that, effectively, the members of a particular
playing group are randomly selected from the population. We verified that the probability for a player to
encounter cooperators is independent of whether that player is a cooperator or a defector, as is required
for well-mixed populations [36]. Players that are not replaced are allowed to accumulate their score,
which is used to calculate the probability that this player's strategy will be chosen to replicate and fill
the spot of a player that was removed in the Moran process. While the spatial version of the game shows
somewhat different dynamics than studied here, we study the well-mixed version because it is amenable
to theoretical prediction (see below). In fact, cooperation is harder to achieve in well-mixed populations,
so most of our conclusions translate to the spatial version but with a lower synergy threshold.
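A replacement step of this kind might look as follows. This is a rough sketch under stated assumptions, not the authors' implementation: the text does not specify how accumulated scores are turned into replication probabilities, so the shift to positive weights below is an assumption, as are the function name `moran_step` and the list-based data layout. Mutation of the copied genes is deliberately left out.

```python
import random

def moran_step(strategies, scores, fraction=0.02, rng=random):
    """Replace a fraction of players with copies of score-weighted survivors.

    strategies: list of (p_C, p_P) tuples; scores: accumulated payoffs.
    Both lists are modified in place; the replaced indices are returned.
    Assumes fewer players are replaced than exist in the population.
    """
    n = len(strategies)
    n_replace = max(1, round(fraction * n))
    dead = rng.sample(range(n), n_replace)          # players removed at random
    dead_set = set(dead)
    survivors = [i for i in range(n) if i not in dead_set]
    low = min(scores[i] for i in survivors)
    # Assumption: replication weight grows linearly with score, shifted to be positive.
    weights = [scores[i] - low + 1e-9 for i in survivors]
    parents = rng.choices(survivors, weights=weights, k=n_replace)
    for slot, parent in zip(dead, parents):
        strategies[slot] = strategies[parent]       # offspring inherits parental genes
        scores[slot] = 0.0                          # newcomers start with no score
    return dead
```

Because the removed slots are chosen independently of position and ancestry, offspring are not placed near their parents, which is what makes the scheme well-mixed.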

The two genes of every individual mutate with a probability μ when replicated. As mentioned earlier,
mutating a probability replaces the probability with a uniformly distributed random number. After
500,000 updates, the line of descent (LOD) of the population is reconstructed [37,38], by picking a
random organism of the final population and following its ancestry all the way back to the starting
organism, which has p_C = 0.5 and p_P = 0.5. Because there is only one species in these populations, the
LODs of the population coalesce to a single LOD (which is why it is sufficient to pick a random genotype
for following the LOD).
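Both the mutation operator and the ancestry walk are short enough to sketch. The bookkeeping structures below (`parent_of` and `genomes`, keyed by an organism id) are hypothetical and only illustrate one way the ancestry could be recorded; the default mutation probability of 0.02 is the value quoted in the figure captions.

```python
import random

def mutate(genome, mu=0.02, rng=random):
    """Replace each gene by a fresh uniform random number with probability mu."""
    return tuple(rng.random() if rng.random() < mu else gene for gene in genome)

def line_of_descent(parent_of, genomes, final_id):
    """Follow ancestry from `final_id` back to the seed; return genomes oldest first.

    parent_of maps organism id -> parent id (None for the seed organism);
    genomes maps organism id -> (p_C, p_P). Both are assumed data layouts.
    """
    lod = []
    current = final_id
    while current is not None:
        lod.append(genomes[current])
        current = parent_of[current]
    lod.reverse()
    return lod
```

Since the lineages of the final population coalesce, walking back from any one organism of the final population yields the same line of descent except near its recent end.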